Smart Paper Notes

Statistical Mechanics of Generalization In Graph Convolution Networks

Cheng Shi , Liming Pan , Hong Hu , Ivan Dokmanić

Category: Machine Learning | (Statistical) Machine Learning

2022-12-26

Graph neural networks (GNN) have become the default machine learning model for relational datasets, including protein interaction networks, biological neural networks, and scientific collaboration graphs. We use tools from statistical physics and random matrix theory to precisely characterize generalization in simple graph convolution networks on the contextual stochastic block model. The derived curves are phenomenologically rich: they explain the distinction between learning on homophilic and heterophilic graphs and they predict double descent whose existence in GNNs has been questioned by recent work. Our results are the first to accurately explain the behavior not only of a stylized graph learning model but also of complex GNNs on messy real-world datasets. To wit, we use our analytic insights about homophily and heterophily to improve performance of state-of-the-art graph neural networks on several heterophilic benchmarks by a simple addition of negative self-loop filters.
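
The role of the self-loop sign can be illustrated with a minimal sketch of one graph convolution step. The function name `graph_conv`, the mean-over-neighbors normalization, and the scalar `self_loop` weight are assumptions made for illustration; the abstract does not specify the exact filter used in the paper.

```python
def graph_conv(adj, features, self_loop=1.0):
    """One propagation step: mean of neighbor features plus self_loop * own feature.

    adj       -- square 0/1 adjacency matrix (list of lists), no diagonal entries
    features  -- one feature vector (list of floats) per node
    self_loop -- hypothetical scalar weight; negative values give a negative self-loop
    """
    n = len(adj)
    dim = len(features[0])
    out = []
    for i in range(n):
        neighbors = [j for j in range(n) if adj[i][j]]
        degree = max(len(neighbors), 1)  # avoid division by zero for isolated nodes
        row = []
        for k in range(dim):
            neighbor_mean = sum(features[j][k] for j in neighbors) / degree
            row.append(neighbor_mean + self_loop * features[i][k])
        out.append(row)
    return out


# Toy heterophilic graph: two connected nodes carrying opposite features.
adj = [[0, 1], [1, 0]]
x = [[1.0], [-1.0]]

positive = graph_conv(adj, x, self_loop=1.0)   # [[0.0], [0.0]]: the class signal cancels
negative = graph_conv(adj, x, self_loop=-1.0)  # [[-2.0], [2.0]]: the classes stay separable
```

On this toy heterophilic pair, a positive self-loop averages a node with its opposite-class neighbor and erases the signal, while a negative self-loop keeps the two nodes apart.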

translated by Google Translate

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

Tianyi Tang , Junyi Li , Zhipeng Chen , Yiwen Hu , Zhuohao Yu , Wenxun Dai , Zican Dong , Xiaoxue Cheng , Yuhao Wang , Wayne Xin Zhao

Category: Natural Language Processing

2022-12-26

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.

Concealed Object Detection for Passive Millimeter-Wave Security Imaging Based on Task-Aligned Detection Transformer

Cheng Guo , Fei Hu , Yan Hu

Category: Computer Vision

2022-12-01

Passive millimeter-wave (PMMW) imaging is a promising technique for human security screening. Several popular object detection networks have been applied to PMMW images. However, because PMMW images have low resolution and high noise, deep-learning-based detection of concealed objects in them usually suffers from low accuracy and low classification confidence. To tackle these problems, this paper proposes a Task-Aligned Detection Transformer network, named PMMW-DETR. In the first stage, a Denoising Coarse-to-Fine Transformer (DCFT) backbone is designed to extract long- and short-range features at different scales. In the second stage, we propose the Query Selection module to introduce learned spatial features into the network as prior knowledge, which enhances the semantic perception capability of the network. In the third stage, aiming to improve classification performance, we use a Task-Aligned Dual-Head block to decouple the classification and regression tasks. On our self-developed PMMW security screening dataset, experimental results, including comparisons with state-of-the-art (SOTA) methods and an ablation study, demonstrate that PMMW-DETR obtains higher accuracy and classification confidence than previous works and is robust to low-quality PMMW images.

Protein Language Models and Structure Prediction: Connection and Progression

Bozhen Hu , Jun Xia , Jiangbin Zheng , Cheng Tan , Yufei Huang , Yongjie Xu , Stan Z. Li

Category: Artificial Intelligence | Machine Learning

2022-11-30

The prediction of protein structures from sequences is an important task for function prediction, drug design, and the understanding of related biological processes. Recent advances have demonstrated the power of language models (LMs) in processing protein sequence databases; these models inherit the advantages of attention networks and capture useful information when learning representations for proteins. The past two years have witnessed remarkable success in tertiary protein structure prediction (PSP), including evolution-based and single-sequence-based PSP. It seems that, instead of energy-based models and sampling procedures, protein language model (pLM)-based pipelines have emerged as the mainstream paradigm in PSP. Despite this fruitful progress, the PSP community needs a systematic and up-to-date survey to help bridge the gap between LMs in the natural language processing (NLP) and PSP domains and to introduce their methodologies, advancements, and practical applications. To this end, in this paper, we first introduce the similarities between protein and human languages that allow LMs to be extended to pLMs and applied to protein databases. Then, we systematically review recent advances in LMs and pLMs from the perspectives of network architectures, pre-training strategies, applications, and commonly used protein databases. Next, different types of methods for PSP are discussed, particularly how pLM-based architectures function in the process of protein folding. Finally, we identify challenges faced by the PSP community and foresee promising research directions along with the advances of pLMs. This survey aims to be a hands-on guide for researchers to understand PSP methods, develop pLMs, and tackle challenging problems in this field for practical purposes.

DATE: Dual Assignment for End-to-End Fully Convolutional Object Detection

Yiqun Chen , Qiang Chen , Qinghao Hu , Jian Cheng

Category: Computer Vision

2022-11-25

Fully convolutional detectors discard the one-to-many assignment and adopt a one-to-one assigning strategy to achieve end-to-end detection but suffer from the slow convergence issue. In this paper, we revisit these two assignment methods and find that bringing one-to-many assignment back to end-to-end fully convolutional detectors helps with model convergence. Based on this observation, we propose Dual Assignment for end-to-end fully convolutional deTEction (DATE). Our method constructs two branches with one-to-many and one-to-one assignment during training and speeds up the convergence of the one-to-one assignment branch by providing more supervision signals. DATE only uses the branch with the one-to-one matching strategy for model inference, which doesn't bring inference overhead. Experimental results show that Dual Assignment gives nontrivial improvements and speeds up model convergence upon OneNet and DeFCN. Code: https://github.com/YiqunChen1999/date.
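
The difference between the two assignment strategies can be sketched on a small cost matrix. This is an illustration only: the greedy one-to-one matcher, the top-k rule for one-to-many, and the names `one_to_one` and `one_to_many` are assumptions, not the matching procedure used in DATE, OneNet, or DeFCN.

```python
def one_to_one(cost):
    """Greedy one-to-one matching: each ground truth takes its cheapest unused prediction.

    cost[g][p] is a hypothetical matching cost between ground truth g and prediction p.
    Assumes there are at least as many predictions as ground truths.
    """
    used = set()
    pairs = {}
    for g, row in enumerate(cost):
        candidates = [p for p in range(len(row)) if p not in used]
        best = min(candidates, key=lambda p: row[p])
        used.add(best)
        pairs[g] = [best]
    return pairs


def one_to_many(cost, k=2):
    """One-to-many matching: each ground truth is assigned its k cheapest predictions."""
    return {
        g: sorted(range(len(row)), key=lambda p: row[p])[:k]
        for g, row in enumerate(cost)
    }


# Two ground-truth boxes, four predictions.
cost = [
    [0.1, 0.4, 0.9, 0.8],
    [0.7, 0.6, 0.2, 0.3],
]

single = one_to_one(cost)     # {0: [0], 1: [2]}
multi = one_to_many(cost, 2)  # {0: [0, 1], 1: [2, 3]}
```

In a dual-branch setup like the one the abstract describes, both sets of positives would supervise their own branch during training, and only the one-to-one branch would be kept at inference, so no duplicate removal step is added.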

SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

Meng-Hao Guo , Cheng-Ze Lu , Qibin Hou , Zhengning Liu , Ming-Ming Cheng , Shi-Min Hu

Category: Computer Vision

2022-09-18

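We present SegNeXt, a simple convolutional network architecture for semantic segmentation. Recent transformer-based models have dominated the field of semantic segmentation because of the efficiency of self-attention in encoding spatial information. In this paper, we show that convolutional attention is a more efficient and effective way to encode contextual information than the self-attention mechanism in transformers. By re-examining the characteristics of successful segmentation models, we identify several key components that lead to improved segmentation performance. This motivates us to design a novel convolutional attention network that uses cheap convolutional operations. Without bells and whistles, SegNeXt significantly improves on previous state-of-the-art methods on popular benchmarks, including ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID. Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters. On average, SegNeXt achieves about 2.0% mIoU improvement over state-of-the-art methods on the ADE20K dataset with the same or fewer computations. Code is available at https://github.com/uyzhang/JSeg (Jittor) and https://github.com/Visual-Attention-Network/SegNeXt (PyTorch).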

From WSI-level to Patch-level: Structure Prior Guided Binuclear Cell Fine-grained Detection

Baomin Wang , Geng Hu , Dan Chen , Lihua Hu , Cheng Li , Yu An , Guiping Hu , Guang Jia

Category: Computer Vision

2022-08-26

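Accurate and fast binuclear cell (BC) detection plays a significant role in predicting the risk of leukemia and other malignant tumors. However, manual microscopy counting is time-consuming and lacks objectivity. Moreover, because of limited staining quality and the diversity of morphological features in BC microscopy whole slide images (WSIs), traditional image processing approaches are ineffective. To overcome this challenge, we propose a deep-learning-based two-stage detection method inspired by the structure prior of BCs, which cascades coarse BC detection at the WSI level with fine-grained classification at the patch level. The coarse detection network is a multi-task detection framework based on circular bounding boxes for cell detection and central key points for nucleus detection. Compared with the usual rectangular boxes, the circle representation reduces the degrees of freedom, mitigates the effect of surrounding impurities, and can be rotation invariant in WSIs. Detecting key points in the nucleus assists network perception, and the key points are later used for unsupervised color layer segmentation in fine-grained classification. The fine classification network consists of a background region suppression module based on color layer mask supervision and a key region selection module based on a transformer, chosen for its global modeling capability. In addition, an unsupervised and unpaired cytoplasm generator network is proposed for the first time to expand the long-tailed distribution dataset. Finally, experiments are performed on BC multicenter datasets. The proposed BC fine detection method outperforms other benchmarks on almost all evaluation criteria, providing clarification and support for tasks such as cancer screening.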
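
To make the circular bounding box idea concrete, the following minimal sketch computes the intersection over union (IoU) of two circles, each described by three parameters (center x, center y, radius) instead of the four of a rectangular box. The function name `circle_iou` and the use of IoU as the overlap measure are assumptions for illustration; the abstract does not state which overlap measure the method uses.

```python
import math


def circle_iou(c1, c2):
    """IoU of two circles given as (cx, cy, r) tuples with r > 0."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)  # distance between centers

    if d >= r1 + r2:
        # Circles are disjoint or touch at a single point.
        inter = 0.0
    elif d <= abs(r1 - r2):
        # One circle lies entirely inside the other.
        inter = math.pi * min(r1, r2) ** 2
    else:
        # Lens-shaped overlap: sum of two circular segments.
        a1 = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
        a2 = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
        triangle = 0.5 * math.sqrt(
            (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
        )
        inter = r1 * r1 * a1 + r2 * r2 * a2 - triangle

    union = math.pi * (r1 * r1 + r2 * r2) - inter
    return inter / union


same = circle_iou((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))    # identical circles
nested = circle_iou((0.0, 0.0, 2.0), (0.0, 0.0, 1.0))  # small circle inside a large one
apart = circle_iou((0.0, 0.0, 1.0), (5.0, 0.0, 1.0))   # no overlap
```

Because the result depends only on the two radii and the distance between centers, rotating the image around any point leaves the IoU unchanged, which is the rotation invariance the abstract refers to.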

ExpansionNet v2: Block Static Expansion in fast end to end training for Image Captioning

Jia Cheng Hu , Roberto Cavicchioli , Alessandro Capotondi

Category: Computer Vision

2022-08-13

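Expansion methods explore the possibility that the input length is a performance bottleneck in deep learning methods. In this work, we introduce Block Static Expansion, which distributes and processes the input over a heterogeneous and arbitrarily large collection of sequences whose lengths differ from that of the input. Using this method, we introduce a new model called ExpansionNet v2, trained with our new training strategy, which is designed to be not only effective but also 6 times faster than the standard approach in recent image captioning work. Our new model achieves state-of-the-art performance on the MS-COCO 2014 captioning challenge, with a score of 143.7 CIDEr-D on the offline test split, 140.8 CIDEr-D on the online evaluation server, and 72.9 All-CIDEr on the nocaps validation set. Source code is available at: https://github.com/jchenghu/expansionnet_v2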

Dual Domain-Adversarial Learning for Audio-Visual Saliency Prediction

Yingzi Fan , Longfei Han , Yue Zhang , Lechao Cheng , Chen Xia , Di Hu

Category: Computer Vision

2022-08-10

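Both visual and auditory information are valuable for determining the salient regions in videos. Deep convolutional neural networks (CNNs) show strong capacity in the audio-visual saliency prediction task. Because of factors such as shooting scene and weather, there is often a moderate distribution discrepancy between the source training data and the target testing data. This domain discrepancy degrades the performance of CNN models on the target testing data. This paper makes an early attempt to tackle the unsupervised domain adaptation problem for audio-visual saliency prediction. We propose a dual domain-adversarial learning algorithm to mitigate the domain discrepancy between source and target data. First, a dedicated domain discrimination branch is built to align the auditory feature distributions. Then, those auditory features are fused into the visual features through a cross-modal self-attention module. A second domain discrimination branch is designed to reduce the domain discrepancy of the visual features and of the audio-visual correlations implied by the fused audio-visual features. Experiments on public benchmarks demonstrate that our method can relieve the performance degradation caused by domain discrepancy.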

PalQuant: Accelerating High-precision Networks on Low-precision Accelerators

Qinghao Hu , Gang Li , Qiman Wu , Jian Cheng

Category: Computer Vision

2022-08-03

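Recently, low-precision deep learning accelerators (DLAs) have become popular because of their advantages in chip area and energy consumption, but low-precision quantized models on these DLAs suffer severe accuracy degradation. One way to achieve both high accuracy and efficient inference is to deploy high-precision neural networks on low-precision DLAs, which has rarely been studied. In this paper, we propose the PArallel Low-precision Quantization (PalQuant) method, which approximates high-precision computations by learning parallel low-precision representations from scratch. In addition, we present a novel cyclic shuffle module to boost cross-group information communication between the parallel low-precision groups. Extensive experiments demonstrate that PalQuant outperforms state-of-the-art quantization methods in both accuracy and inference speed. For example, for ResNet-18 quantization, PalQuant obtains 0.52% higher accuracy and a 1.78× speedup simultaneously over its 4-bit counterpart on a state-of-the-art 2-bit accelerator. Code is available at https://github.com/huqinghao/palquant.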
